161 results found.
Written
Evaluation Tool,
Language Type:
Multilingual
Languages:
English French German Hebrew Russian
Availability:
Freely Available
License:
Apache License, Version 2.0
Size:
62 MByte
Production Status:
Newly created-in progress
Use:
Syntactic Evaluation (and Evaluation Set Generators)
- Paper title: Cross-Linguistic Syntactic Evaluation of Word Prediction Models
- Paper track: Long/Interpretability and Analysis of Models for NLP
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Aaron Mueller | CLAMS: Cross-Linguistic Assessment of Models on Syntax | /N |
Documentation:
README.md on Github repository in English
Speech
Corpus,
Language Type:
Monolingual
Languages:
Russian
Availability:
Freely Available
License:
Size:
1240 hours
Production Status:
Newly created-in progress
Use:
Speech Recognition/Understanding
- Paper title: Golos: Russian Dataset for Speech Research
- Paper track: 8.13 Other topics in Speech Recognition: Signal Pr/Poster Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Nikolay Karpov | Golos: Russian Dataset for Speech Research | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Catalan Chinese English Esperanto French German Italian Kabyle Kinyarwanda Persian Polish Russian Spanish Welsh
Availability:
Freely Available
License:
Creative Commons license
Size:
8.8k hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech
- Paper track: 8.1 Feature extraction and low-level feature model/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Laurent Besacier | Common Voice | /N |
Documentation:
https://arxiv.org/pdf/1912.06670.pdf, English, public
Language Type:
Multilingual
Languages:
Russian
Availability:
Freely Available
License:
<Not Specified>
Size:
5447 entries
Production Status:
Newly created-in progress
Use:
Document Classification, Text categorisation
- Paper title: Designing a Russian Idiom-Annotated Corpus
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Katsiaryna Aharodnik | CUNY Graduate Center | US |
| Author 2 | Anna Feldman | Montclair State University | US |
| Author 3 | Jing Peng | Montclair State University | US |
| Main Contact | Katsiaryna Aharodnik | CUNY Graduate Center | None |
Documentation:
<Not Specified>
Language Type:
Trilingual
Languages:
English Farsi Russian
Availability:
From Owner
License:
<Not Specified>
Size:
397948
Production Status:
Existing-updated
Use:
Automatic metaphor recognition
- Paper title: Automatic Expansion of the MRC Psycholinguistic Database Imageability Ratings
- Paper track: Terminology
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country | Affiliation 2 | Country 2 | Affiliation 3 | Country 3 |
|---|---|---|---|---|---|---|---|
| Author 1 | Ting Liu | ILS, University at Albany | US | | | | |
| Author 2 | Kit Cho | University of Houston-Downtown | None | SUNY - University at Albany | US | University at Albany | US |
| Author 3 | G. Aaron Broadwell | University at Albany | US | | | | |
| Author 4 | Samira Shaikh | University at Albany | None | University at Albany | US | | |
| Author 5 | Tomek Strzalkowski | University at Albany | US | | | | |
| Author 6 | John Lien | University at Albany | US | | | | |
| Author 7 | Sarah Taylor | Sarah M. Taylor Consulting, LLC | US | | | | |
| Author 8 | Laurie Feldman | University at Albany | US | | | | |
| Author 9 | Boris Yamrom | University at Albany | US | | | | |
| Author 10 | Nick Webb | Union College | US | | | | |
| Author 11 | Umit Boz | University at Albany | US | | | | |
| Author 12 | Ignacio Cases | University at Albany | US | | | | |
| Author 13 | Ching-Sheng Lin | ILS Institute, University at Albany, SUNY | US | | | | |
| Main Contact | Ting Liu | ILS, University at Albany | None | Siena College | None | | |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
Ancient Greek Arabic Chinese English Finnish Hebrew Korean Russian Swedish
Availability:
Freely Available
License:
CreativeCommons, Gnu
Size:
11814230 tokens
Production Status:
Existing-used
Use:
Parsing and Tagging
- Paper title: The (Non-)Utility of Structural Features in BiLSTM-based Dependency Parsers
- Paper track: Long/Tagging, Chunking, Syntax and Parsing
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Agnieszka Falenska | Universal Dependencies 2.0 | /N |
Documentation:
https://universaldependencies.org/v2/
Written
Corpus,
Language Type:
Multilingual
Languages:
English French German Russian
Availability:
Freely Available
License:
Size:
None
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
- Paper track: Long/Machine Translation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Elena Voita | WMT data | /N |
Documentation:
None
Written
Corpus,
Language Type:
Bilingual
Languages:
English Russian
Availability:
Freely Available
License:
Size:
None
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
- Paper track: Long/Machine Translation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Elena Voita | OpenSubtitles | /N |
Documentation:
None
Written
Ontology,
Language Type:
Monolingual
Languages:
Russian
Availability:
Freely Available
License:
Size:
110 thousand entries
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
- Paper title: Corpus-based Check-up for Thesaurus
- Paper track: Short/Resources and Evaluation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Natalia Loukachevitch | Russian wordnet RuWordNet | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Russian
Availability:
Freely Available
License:
MIT
Size:
300000 entries
Production Status:
Newly created-finished
Use:
Document Classification, Text categorisation
- Paper title: Large Dataset and Language Model Fun-Tuning for Humor Recognition
- Paper track: Short/Resources and Evaluation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Pavel Braslavski | FUN | /N |
Documentation:
None




